Canonical Variate Analysis (CVA) biplot

Aim: A dimension reduction technique that maximises the variation between classes while minimising the within-class variation.

This is achieved in three steps:

  • Decompose the variance.
  • Find a linear mapping to the canonical space.
  • Find a low-dimensional approximation.

Variance Decomposition

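The classical variance decomposition \[\mathbf{T}=\mathbf{B}+\mathbf{W}\]

has the following analogue in this setting: \[ \mathbf{X}'\mathbf{X} = \bar{\mathbf{X}}'\mathbf{C}\bar{\mathbf{X}} + \mathbf{X}' [\mathbf{I} - \mathbf{G}(\mathbf{G}'\mathbf{G})^{-1}\mathbf{C}(\mathbf{G}'\mathbf{G})^{-1}\mathbf{G}'] \mathbf{X}, \]

where \(\mathbf{G}\) is the indicator matrix of class membership and \(\bar{\mathbf{X}}=(\mathbf{G}'\mathbf{G})^{-1}\mathbf{G}'\mathbf{X}\) is the matrix of class means.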

The choice of \(\mathbf{C}\) determines the variant of CVA:

  • Weighted: \(\mathbf{C}=\mathbf{N}=\mathbf{G'G}\)
  • Unweighted: \(\mathbf{C}=\mathbf{I}_G - G^{-1}\mathbf{1}_G\mathbf{1}_G'\)
  • Unweighted with weighted centroid: \(\mathbf{C}=\mathbf{I}_G\)
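
The decomposition can be checked numerically. The following is a minimal base R sketch, not the package's own code: it builds the indicator matrix and class means for the state.x77 data and confirms that the between and within parts add up to the total for each of the three choices of \(\mathbf{C}\). Standardising the variables first and the helper name decompose() are assumptions made for illustration only.

```r
# Centre and scale the data (assumption: scaling is used here only to keep
# the numbers on a comparable footing; the identity holds without it).
X <- scale(state.x77)
G <- model.matrix(~ state.region - 1)        # n x G indicator matrix
N <- crossprod(G)                            # G'G: diagonal matrix of class sizes
Xbar <- solve(N) %*% crossprod(G, X)         # G x p matrix of class means
n_classes <- ncol(G)

# Hypothetical helper: between (B) and within (W) parts for a given C.
decompose <- function(C) {
  B <- t(Xbar) %*% C %*% Xbar
  H <- G %*% solve(N) %*% C %*% solve(N) %*% t(G)
  W <- t(X) %*% (diag(nrow(X)) - H) %*% X
  list(B = B, W = W)
}

# The three choices of C listed above.
C_list <- list(
  weighted = N,
  unweighted = diag(n_classes) - matrix(1 / n_classes, n_classes, n_classes),
  unweighted_weighted_centroid = diag(n_classes)
)

# For every choice of C, X'X should equal B + W.
checks <- sapply(C_list, function(C) {
  d <- decompose(C)
  isTRUE(all.equal(crossprod(X), d$B + d$W, check.attributes = FALSE))
})
print(checks)
```

Only the weighted choice reduces the second term to the familiar pooled within-class matrix, since \(\mathbf{C}=\mathbf{N}\) makes the bracketed matrix equal to \(\mathbf{I}-\mathbf{G}(\mathbf{G}'\mathbf{G})^{-1}\mathbf{G}'\).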

Linear Mapping

Find a linear mapping

\[\mathbf{Y}=\mathbf{X}\mathbf{M}, \tag{1}\]

such that each column \(\mathbf{m}\) of \(\mathbf{M}\) maximises \[\frac{\mathbf{m}'\mathbf{B}\mathbf{m}}{\mathbf{m}'\mathbf{W}\mathbf{m}} \tag{2}\] subject to \(\mathbf{m}'\mathbf{W}\mathbf{m}=1\).

It can be shown that this leads to the following equivalent eigen equations:

\[ \mathbf{W}^{-1}\mathbf{BM} = \mathbf{M \Lambda} \tag{3} \]

\[ \mathbf{BM} = \mathbf{WM \Lambda} \tag{4} \]

\[ (\mathbf{W}^{-\frac{1}{2}} \mathbf{B} \mathbf{W}^{-\frac{1}{2}}) \mathbf{W}^{\frac{1}{2}} \mathbf{M} = (\mathbf{W}^{\frac{1}{2}} \mathbf{M}) \mathbf{\Lambda} \tag{5} \]

with \(\mathbf{M'BM}= \mathbf{\Lambda}\) and \(\mathbf{M'WM}= \mathbf{I}\).
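
As an illustration, the eigen problem can be solved in base R through the symmetric matrix \(\mathbf{W}^{-\frac{1}{2}} \mathbf{B} \mathbf{W}^{-\frac{1}{2}}\): if \(\mathbf{V}\) holds its eigenvectors, then \(\mathbf{M}=\mathbf{W}^{-\frac{1}{2}}\mathbf{V}\). This is a sketch under stated assumptions (weighted CVA, standardised state.x77 data for numerical stability) and is not taken from the package source.

```r
# Weighted CVA ingredients for the state.x77 data.
X <- scale(state.x77)                         # assumption: standardised for stability
G <- model.matrix(~ state.region - 1)         # indicator matrix of class membership
N <- crossprod(G)
Xbar <- solve(N) %*% crossprod(G, X)          # class means
B <- t(Xbar) %*% N %*% Xbar                   # between-class matrix, C = N
W <- crossprod(X) - B                         # within-class matrix

# Symmetric inverse square root of W from its spectral decomposition.
eW <- eigen(W, symmetric = TRUE)
W_inv_sqrt <- eW$vectors %*% diag(1 / sqrt(eW$values)) %*% t(eW$vectors)

# Eigen decomposition of the symmetric matrix W^{-1/2} B W^{-1/2}.
eS <- eigen(W_inv_sqrt %*% B %*% W_inv_sqrt, symmetric = TRUE)
Lambda <- diag(eS$values)                     # eigenvalues in decreasing order
M <- W_inv_sqrt %*% eS$vectors                # M = W^{-1/2} V

# Numerical checks of the stated properties.
ok_W  <- isTRUE(all.equal(t(M) %*% W %*% M, diag(ncol(X)), check.attributes = FALSE))
ok_B  <- isTRUE(all.equal(t(M) %*% B %*% M, Lambda, check.attributes = FALSE))
ok_eq <- isTRUE(all.equal(B %*% M, W %*% M %*% Lambda, check.attributes = FALSE))
n_nonzero <- sum(eS$values > 1e-8)            # number of non-zero eigenvalues
print(c(ok_W, ok_B, ok_eq))
print(n_nonzero)
```

Working with the symmetric form avoids the non-symmetric matrix \(\mathbf{W}^{-1}\mathbf{B}\), for which a general eigen solver gives no guarantee of orthogonal vectors.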

Since the matrix \(\mathbf{W}^{-\frac{1}{2}} \mathbf{B} \mathbf{W}^{-\frac{1}{2}}\) is symmetric and positive semi-definite, the eigenvalues in the matrix \(\mathbf{\Lambda}\) are non-negative and can be arranged in decreasing order. In general \(\mathrm{rank}(\mathbf{B}) = \min(p, G-1)\), so that only the first \(\mathrm{rank}(\mathbf{B})\) eigenvalues are non-zero. We form the canonical variates with the transformation

\[ \bar{\mathbf{Y}} = \bar{\mathbf{X}}\mathbf{M}.\tag{6} \]

Low dimensional approximation

The first two canonical variates are given by:

\[\bar{\mathbf{Z}}=\bar{\mathbf{Y}}\mathbf{J}_2=\bar{\mathbf{X}}\mathbf{M}\mathbf{J}_2, \tag{7}\]

where \(\mathbf{J}_2'=[\mathbf{I}_2 \quad \mathbf{0}]\). We add the individual sample points with the same transformation:

\[\mathbf{Z}=\mathbf{X}\mathbf{M}\mathbf{J}_2. \tag{8}\]

A new sample point, \(\mathbf{x}^*\), can be added by interpolation: \[\mathbf{z}^*=\mathbf{x}^*\mathbf{M}\mathbf{J}_2.\tag{9}\]
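
A minimal base R sketch of these projections follows, assuming weighted CVA on the standardised state.x77 data. Texas is used as a stand-in for a "new" observation, purely so that the interpolated point can be compared with its known position. The new point has to be centred and scaled in the same way as the original data before it is projected.

```r
X <- scale(state.x77)                         # assumption: standardised for stability
G <- model.matrix(~ state.region - 1)         # indicator matrix of class membership
N <- crossprod(G)
Xbar <- solve(N) %*% crossprod(G, X)          # class means
B <- t(Xbar) %*% N %*% Xbar                   # weighted CVA, C = N
W <- crossprod(X) - B

# M = W^{-1/2} V, with V the eigenvectors of W^{-1/2} B W^{-1/2}.
eW <- eigen(W, symmetric = TRUE)
W_inv_sqrt <- eW$vectors %*% diag(1 / sqrt(eW$values)) %*% t(eW$vectors)
M <- W_inv_sqrt %*% eigen(W_inv_sqrt %*% B %*% W_inv_sqrt, symmetric = TRUE)$vectors

# J2 keeps the first two canonical variates: t(J2) = [I_2  0].
J2 <- rbind(diag(2), matrix(0, ncol(X) - 2, 2))
Zbar <- Xbar %*% M %*% J2                     # class means in the biplot plane
Z <- X %*% M %*% J2                           # samples in the biplot plane

# Interpolate a point from its raw measurements.
x_raw <- state.x77["Texas", ]                 # stand-in for a new observation
x_star <- (x_raw - attr(X, "scaled:center")) / attr(X, "scaled:scale")
z_star <- x_star %*% M %*% J2                 # 1 x 2 coordinates in the biplot

print(z_star)
print(Z["Texas", ])
```

Because the stand-in point is one of the original samples, the two printed coordinate pairs should agree.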

CVA function

CVA()

Argument          Description
bp                Object of class biplot.
classes           Vector of class membership. User specified; otherwise defaults to the vector specified in biplot.
dim.biplot        Dimension of the biplot. Only values 1, 2 and 3 are accepted, with default 2.
e.vects           Which eigenvectors (canonical variates) to extract, with default 1:dim.biplot.
weightedCVA       "weighted", "unweightedCent" or "unweightedI": controls which type of CVA to perform, with default "weighted".
show.class.means  TRUE or FALSE: controls whether class means are plotted, with default TRUE.
low.dim           "sample.opt" or "Bhattacharyya.dist": controls the method of constructing additional dimension(s) if dim.biplot is greater than the dimension of the canonical space, with default "sample.opt".

Class means

The means() function allows the user to make adjustments to the points representing the class means.

Argument  Description
bp        an object of class biplot.

Plotting only a selection of the class means

Argument  Description
which     a vector containing the groups or classes for which the means should be displayed, with default bp$g.

library(biplotEZ)
biplot(state.x77) |> CVA(state.region) |> means(which = c(2, 3), label = TRUE) |> plot()

Colours and characters for the points

The following arguments control the aesthetic options for the plotted class mean points:

Argument      Description
col           the colour(s) for the means, with default as the colour of the samples.
pch           the plotting character(s) for the means, with default 15.
cex           the character expansion(s) for the means, with default 1.
opacity       the opacity (transparency) of the means.
shade.darker  a logical value indicating whether the colour of the mean points should be made a shade darker than the default or specified colour, with default TRUE.

biplot(state.x77) |> CVA(state.region) |> means(cex = c(1,2,3,4),col = "red",pch = c(9,13,4,16)) |> plot()

Labels

The following arguments control the aesthetic options for the labels accompanying the plotted class mean points:

Argument      Description
label         a logical value indicating whether the means should be labelled, with default TRUE.
label.col     a vector of the same length as which with label colours for the means, with default as the colour of the means.
label.cex     a vector of the same length as which with label text expansions for the means, with default 0.75.
label.side    the side at which the label of the plotted mean point appears, with default "bottom".
label.offset  the offset of the label from the plotted mean point.

biplot(state.x77) |> CVA(state.region) |> means(label = TRUE,label.side = "top",label.offset = 2,label.cex = 1) |> plot()

Classification regions

The classify() function creates classification regions for the CVA biplot. It adds the following elements to the biplot object:

  • A confusion matrix from the classification into classes

  • The classification accuracy rate

  • A logical value indicating whether classification regions are shown in the biplot

  • A list of chosen aesthetics for the classification regions

  • The midpoints of the classification regions

Classification regions in the CVA biplot

class.examp<-biplot(state.x77) |> CVA(state.region) |> classify(col = c("cornflowerblue","darkolivegreen3","darkgoldenrod","aquamarine"))
#class.examp$classify
class.examp |> plot()
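
To illustrate what a confusion matrix and accuracy rate of this kind look like, here is a rough base R sketch. It assumes that each sample is assigned to the class whose mean is nearest in the two-dimensional biplot plane; whether classify() uses exactly this rule internally is an assumption, and the object names are hypothetical.

```r
X <- scale(state.x77)                         # assumption: standardised for stability
G <- model.matrix(~ state.region - 1)         # indicator matrix of class membership
N <- crossprod(G)
Xbar <- solve(N) %*% crossprod(G, X)          # class means
B <- t(Xbar) %*% N %*% Xbar                   # weighted CVA, C = N
W <- crossprod(X) - B

# Canonical transformation M = W^{-1/2} V.
eW <- eigen(W, symmetric = TRUE)
W_inv_sqrt <- eW$vectors %*% diag(1 / sqrt(eW$values)) %*% t(eW$vectors)
M <- W_inv_sqrt %*% eigen(W_inv_sqrt %*% B %*% W_inv_sqrt, symmetric = TRUE)$vectors

# Coordinates of samples and class means in the first two canonical variates.
Z <- X %*% M[, 1:2]
Zbar <- Xbar %*% M[, 1:2]

# Squared Euclidean distance from every sample to every class mean (50 x 4).
d2 <- sapply(seq_len(nrow(Zbar)), function(k) rowSums(sweep(Z, 2, Zbar[k, ])^2))

# Assumed rule: assign each sample to the nearest class mean in the plane.
nearest <- apply(d2, 1, which.min)
predicted <- factor(levels(state.region)[nearest], levels = levels(state.region))

confusion <- table(actual = state.region, predicted = predicted)
accuracy <- sum(diag(confusion)) / sum(confusion)
print(confusion)
print(accuracy)
```

Under this rule the classification regions are the Voronoi cells of the class means in the biplot plane, which is why the boundaries between regions appear as straight lines.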

\(\alpha\)-bags containing a percentage of observations

The alpha.bags() function creates \(\alpha\)-bags and adds the following elements to the biplot object:

  • A list of coordinates for the \(\alpha\)-bags for each group

  • A vector of colours for the \(\alpha\)-bags

  • A vector of line types for the \(\alpha\)-bags

  • A vector of line widths for the \(\alpha\)-bags

The \(\alpha\)-bags in the CVA biplot

ab.examp<-biplot(state.x77) |> CVA(state.region) |> alpha.bags(alpha = c(0.85,0.9,0.95,0.99),lty = c(1,2,3,4))
# Computing 0.85 -bag for Northeast 
# Computing 0.9 -bag for South 
# Computing 0.95 -bag for North Central 
# Computing 0.99 -bag for West
#ab.examp$alpha.bags
ab.examp |> plot()

Concentration ellipses

The ellipses() function creates \(\kappa\)-ellipses and adds the following elements to the biplot object:

  • A list of coordinates for the \(\kappa\)-ellipses for each group

  • A vector of colours for the \(\kappa\)-ellipses

  • A vector of line types for the \(\kappa\)-ellipses

  • A vector of line widths for the \(\kappa\)-ellipses

  • A vector of \(\alpha\) values

Concentration ellipses in the CVA biplot

kc.examp<-biplot(state.x77) |> CVA(state.region) |> ellipses(alpha = c(0.85,0.9,0.95,0.99),lwd = c(1,2,3,4))
# Computing 1.95 -ellipse for Northeast 
# Computing 2.15 -ellipse for South 
# Computing 2.45 -ellipse for North Central 
# Computing 3.03 -ellipse for West
#kc.examp$conc.ellipses
kc.examp |> plot()
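
The \(\kappa\) values printed above are consistent with the usual relationship between the coverage \(\alpha\) and the size of a concentration ellipse under bivariate normality, \(\kappa=\sqrt{\chi^2_{2,\alpha}}\). A short base R check, offered as a sketch and not as the package's internal computation:

```r
# Coverage probabilities used in the ellipses() call above.
alpha <- c(0.85, 0.90, 0.95, 0.99)

# kappa is the square root of the chi-squared quantile with 2 degrees of
# freedom (one per biplot dimension).
kappa_2d <- sqrt(qchisq(alpha, df = 2))
print(round(kappa_2d, 2))

# In one dimension the same rule with df = 1 gives the familiar normal quantile.
kappa_1d <- sqrt(qchisq(0.95, df = 1))
print(round(kappa_1d, 2))
```

The rounded two-dimensional values match the 1.95, 2.15, 2.45 and 3.03 reported in the output.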

Measures of fit

bp <- biplot(state.x77) |> CVA(classes = state.region) |> fit.measures()

The fit.measures() function adds the following information on how well the biplot represents the original and canonical spaces:

  • quality: Quality of fit for canonical and original variables
  • adequacy: Adequacy of original variables
  • axis.predictivity: Axis predictivity
  • class.predictivity: Class predictivity
  • within.class.axis.predictivity: Within class axis predictivity
  • within.class.sample.predictivity: Within class sample predictivity

Summary of measures of fit

The summary() function prints to screen the fit.measures stored in the object of class biplot.

bp |> summary()
# Object of class biplot, based on 50 samples and 8 variables.
# 8 numeric variables.
# 4 classes: Northeast South North Central West 
# 
# Quality of fit of canonical variables in 2 dimension(s) = 91.9% 
# Quality of fit of original variables in 2 dimension(s) = 93.4% 
# Adequacy of variables in 2 dimension(s):
#  Population      Income  Illiteracy    Life Exp      Murder     HS Grad 
# 0.453533269 0.105327455 0.107221535 0.002201286 0.208653101 0.687840023 
#       Frost        Area 
# 0.452308013 0.118544323 
# Axis predictivity in 2 dimension(s):
# Population     Income Illiteracy   Life Exp     Murder    HS Grad      Frost 
#  0.9873763  0.9848608  0.8757913  0.9050208  0.9955088  0.9970346  0.9558192 
#       Area 
#  0.9344651 
# Class predictivity in 2 dimension(s):
#     Northeast         South North Central          West 
#     0.8031465     0.9985089     0.6449906     0.9988469 
# Within class axis predictivity in 2 dimension(s):
# Population     Income Illiteracy   Life Exp     Murder    HS Grad      Frost 
# 0.02246821 0.10349948 0.27870637 0.21460313 0.29836047 0.87510975 0.22320989 
#       Area 
# 0.13603927 
# Within class sample predictivity in 2 dimension(s):
#        Alabama         Alaska        Arizona       Arkansas     California 
#    0.769417280    0.174566384    0.328610375    0.148035077    0.103141908 
#       Colorado    Connecticut       Delaware        Florida        Georgia 
#    0.357627854    0.079176621    0.438089663    0.327270922    0.558038750 
#         Hawaii          Idaho       Illinois        Indiana           Iowa 
#    0.029173037    0.167543892    0.076948041    0.473148418    0.592667777 
#         Kansas       Kentucky      Louisiana          Maine       Maryland 
#    0.774719240    0.439306768    0.190654770    0.086183357    0.284829878 
#  Massachusetts       Michigan      Minnesota    Mississippi       Missouri 
#    0.428103056    0.188094295    0.644844800    0.163103449    0.719255739 
#        Montana       Nebraska         Nevada  New Hampshire     New Jersey 
#    0.239142302    0.671350698    0.015766988    0.386053551    0.207503850 
#     New Mexico       New York North Carolina   North Dakota           Ohio 
#    0.012872885    0.008101305    0.872322617    0.457852394    0.092634247 
#       Oklahoma         Oregon   Pennsylvania   Rhode Island South Carolina 
#    0.561156131    0.158926944    0.261838286    0.482912999    0.229047767 
#   South Dakota      Tennessee          Texas           Utah        Vermont 
#    0.095865021    0.237667483    0.121494852    0.349495632    0.256983459 
#       Virginia     Washington  West Virginia      Wisconsin        Wyoming 
#    0.453608981    0.044780371    0.346223950    0.544998639    0.174849092
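
One common way to define the quality of fit of the canonical variables is the share of the total of the eigenvalues in \(\mathbf{\Lambda}\) that is carried by the retained dimensions. The base R sketch below computes that ratio for two dimensions under weighted CVA. That this definition is the one behind the figure reported by summary() is an assumption, so no expected value is stated here.

```r
X <- scale(state.x77)                         # assumption: standardised for stability
G <- model.matrix(~ state.region - 1)         # indicator matrix of class membership
N <- crossprod(G)
Xbar <- solve(N) %*% crossprod(G, X)          # class means
B <- t(Xbar) %*% N %*% Xbar                   # weighted CVA, C = N
W <- crossprod(X) - B

# Eigenvalues of W^{-1/2} B W^{-1/2}, which equal those of W^{-1} B.
eW <- eigen(W, symmetric = TRUE)
W_inv_sqrt <- eW$vectors %*% diag(1 / sqrt(eW$values)) %*% t(eW$vectors)
lambda <- eigen(W_inv_sqrt %*% B %*% W_inv_sqrt, symmetric = TRUE)$values
lambda <- pmax(lambda, 0)                     # clip tiny negative round-off to zero

# Assumed definition: proportion of the eigenvalue total in the first two dimensions.
quality_canonical <- sum(lambda[1:2]) / sum(lambda)
print(quality_canonical)
```

With four classes at most three eigenvalues are non-zero, so the two-dimensional ratio cannot fall below two thirds.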

Rotation

The rotate() function rotates the samples and axes in the biplot by rotate.degrees degrees.

par(mfrow=c(1,2))
bp |> plot()
bp |> rotate(rotate.degrees = 90) |> plot()

Reflection

The reflect() function reflects the samples and axes in the biplot along an axis: x (horizontal reflection), y (vertical reflection) or xy (diagonal reflection).

par(mfrow=c(1,2))
bp |> plot()
bp |> reflect(reflect.axis = "y") |> plot()
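
Geometrically, rotating or reflecting a two-dimensional biplot leaves all distances between points unchanged, which is why neither operation affects the interpretation. A small base R sketch on made-up coordinates follows. The helper names, the anticlockwise direction of rotation and the sign conventions for each reflection are assumptions and may differ from the package's conventions.

```r
# Hypothetical helper: rotate row coordinates anticlockwise by a number of degrees.
rotate_coords <- function(Z, degrees) {
  theta <- degrees * pi / 180
  R <- matrix(c(cos(theta), -sin(theta), sin(theta), cos(theta)), 2, 2)
  Z %*% R
}

# Hypothetical helper: flip the sign of one or both coordinates.
# Assumed convention: "x" mirrors left to right, "y" mirrors top to bottom.
reflect_coords <- function(Z, axis = c("x", "y", "xy")) {
  axis <- match.arg(axis)
  signs <- switch(axis, x = c(-1, 1), y = c(1, -1), xy = c(-1, -1))
  sweep(Z, 2, signs, `*`)
}

# Made-up biplot coordinates for four points.
Z <- matrix(c(1, 0,
              0, 2,
              -1, 1,
              3, -2), ncol = 2, byrow = TRUE)

Z_rot <- rotate_coords(Z, 90)
Z_ref <- reflect_coords(Z, "y")

# Distances between points are preserved by both operations.
same_after_rotation <- isTRUE(all.equal(as.numeric(dist(Z)), as.numeric(dist(Z_rot))))
same_after_reflection <- isTRUE(all.equal(as.numeric(dist(Z)), as.numeric(dist(Z_ref))))
print(c(same_after_rotation, same_after_reflection))
```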

The argument zoom=TRUE in plot()

The zoom argument is FALSE by default. If zoom=TRUE, a new graphical device is launched and the user is prompted to click on the desired upper-left and lower-right corners of the zoomed-in plot.

bp |>  plot(zoom=TRUE)

One-dimensional CVA biplot of the state.x77 data

biplot(state.x77,classes=state.region) |> CVA(dim.biplot=1) |> classify() |> plot()

One-dimensional PCA biplot of the iris data

biplot(iris,group.aes=iris$Species) |> PCA(dim.biplot=1) |> density1D() |> ellipses() |> plot()
# Computing 1.96 -ellipse for setosa 
# Computing 1.96 -ellipse for versicolor 
# Computing 1.96 -ellipse for virginica

3 Dimensional biplots

The dim.biplot argument can be set to 3 to allow the user to create a 3D biplot. The plot() function makes use of the RGL device for the 3D display.

3D PCA biplot of the iris data

biplot(iris) |> PCA(group.aes = iris[,5],dim.biplot = 3)|> plot()

3D biplot of the state.x77 data with class means

biplot(state.x77) |> CVA(classes = state.region,dim.biplot = 3) |> means(col = "red",cex = 5) |> plot()